Analysis of the algorithm: From kernels to backup genes.

Kernelization section

The algorithm transformed each semantic similarity matrix to make it compatible with a kernel. Once this had been done for every network and kernel type, the resulting kernel matrices were integrated by kernel type. The tables below give a general analysis of the properties of each matrix at the different phases of the process.
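The report does not state which transformation was used to make a similarity matrix kernel-compatible. As an illustration only, the sketch below shows one common approach: symmetrize the matrix and clip its negative eigenvalues so that the result is positive semidefinite. The function name `make_kernel_compatible` and the toy matrix are hypothetical and are not taken from the analysed algorithm.

```python
import numpy as np

def make_kernel_compatible(sim):
    """Return a symmetric positive semidefinite version of `sim`.

    Assumption: eigenvalue clipping is used here as a generic repair;
    the analysed algorithm may use a different transformation.
    """
    sym = (sim + sim.T) / 2.0                # enforce symmetry
    eigvals, eigvecs = np.linalg.eigh(sym)   # eigendecomposition of a symmetric matrix
    eigvals = np.clip(eigvals, 0.0, None)    # set negative eigenvalues to zero
    return (eigvecs * eigvals) @ eigvecs.T   # rebuild the matrix from the clipped spectrum

# Hypothetical toy similarity matrix that is symmetric but not positive semidefinite
sim = np.array([[1.0, 0.9, 0.1],
                [0.9, 1.0, 0.9],
                [0.1, 0.9, 1.0]])
k = make_kernel_compatible(sim)
```

After the call, `k` is symmetric and has no negative eigenvalues (up to floating-point error), which is the basic requirement for a valid kernel matrix.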

Annotation properties

Table 1. Annotation file descriptors

Net Min Max Average Standard_Deviation
biological_process 1 17 4.25 5.339653992713831
cellular_component 1 8 3.2666666666666666 3.809345233909774
disease 1 8 1.96 2.4738633753705965
molecular_function 1 7 2.397727272727273 2.744829850663044
phenotype 1 251 35.3 53.70754136990447
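The descriptors in Table 1 can be reproduced with a few lines of code. The sketch below assumes that each row summarizes a list of per-item counts (for example, annotations per gene) and that the standard deviation is the sample standard deviation; the report states neither, so both are assumptions, and the `counts` values are hypothetical.

```python
import statistics

def describe_annotation_counts(counts):
    """Summarize a list of counts with the column names used in Table 1."""
    return {
        "Min": min(counts),
        "Max": max(counts),
        "Average": statistics.mean(counts),
        # Assumption: sample standard deviation; use statistics.pstdev
        # instead if the population formula is intended.
        "Standard_Deviation": statistics.stdev(counts),
    }

# Hypothetical counts, not taken from the annotation files
counts = [1, 1, 2, 4]
summary = describe_annotation_counts(counts)
print(summary)
```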

Matrix properties

Table 2. Similarity matrices

Net Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
biological_process_sim 84x84 7056 6046
cellular_component_sim 90x90 8100 8010
disease_sim 100x100 10000 8818
molecular_function_sim 88x88 7744 7656
phenotype_sim 100x100 10000 9900
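The three matrix descriptors used in these tables are straightforward to compute. The following is a minimal sketch with NumPy; the function name `describe_matrix` and the toy matrix are hypothetical. Matrix_Elements is the product of the two dimensions (for example, 90x90 gives 8100), and Matrix_Elements_Non_Zero counts the entries that differ from zero.

```python
import numpy as np

def describe_matrix(matrix):
    """Return the descriptors reported in the matrix tables."""
    rows, cols = matrix.shape
    return {
        "Matrix_Dimensions": f"{rows}x{cols}",
        "Matrix_Elements": int(matrix.size),
        "Matrix_Elements_Non_Zero": int(np.count_nonzero(matrix)),
    }

# Hypothetical toy similarity matrix with a zero diagonal
toy = np.array([[0.0, 0.5, 0.0],
                [0.5, 0.0, 0.2],
                [0.0, 0.2, 0.0]])
descriptors = describe_matrix(toy)
print(descriptors)
```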

Table 3. Filtered similarity matrices

Table 4. Uncombined kernel matrices

Net Kernel Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
biological_process ct 84x84 7056 7056
biological_process el 84x84 7056 7056
biological_process ka 84x84 7056 6130
biological_process rf 84x84 7056 7056
cellular_component ct 90x90 8100 8100
cellular_component el 90x90 8100 8100
cellular_component ka 90x90 8100 8100
cellular_component rf 90x90 8100 8100
disease ct 100x100 10000 10000
disease el 100x100 10000 10000
disease ka 100x100 10000 8918
disease rf 100x100 10000 10000
molecular_function ct 88x88 7744 7744
molecular_function el 88x88 7744 7744
molecular_function ka 88x88 7744 7744
molecular_function rf 88x88 7744 7744
phenotype ct 100x100 10000 10000
phenotype el 100x100 10000 10000
phenotype ka 100x100 10000 10000
phenotype rf 100x100 10000 10000

Table 5. Integrated kernel matrices

Integration Kernel Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
integration_mean_by_presence ct 284x284 80656 29308
integration_mean_by_presence el 284x284 80656 29308
integration_mean_by_presence ka 284x284 80656 28172
integration_mean_by_presence rf 284x284 80656 29308
mean ct 284x284 80656 29308
mean el 284x284 80656 29308
mean ka 284x284 80656 28172
mean rf 284x284 80656 29308
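The integrated matrices are 284x284 while the individual kernels have between 84 and 100 nodes, which suggests that integration is performed over the union of the node sets. The sketch below illustrates one plausible reading of the two integration names, and it is an assumption rather than the documented behaviour: "mean" divides each summed entry by the total number of kernels, while "integration_mean_by_presence" divides by the number of kernels in which both nodes of the pair are present. The function `integrate` and the gene labels are hypothetical.

```python
import numpy as np

def integrate(kernels, mode="mean"):
    """Integrate labelled kernel matrices over the union of their nodes.

    kernels: list of (labels, matrix) pairs; labels are unique within a pair.
    mode: "mean" or "mean_by_presence" (assumed semantics, see text).
    """
    all_labels = sorted({label for labels, _ in kernels for label in labels})
    index = {label: i for i, label in enumerate(all_labels)}
    n = len(all_labels)
    total = np.zeros((n, n))      # sum of kernel values per node pair
    presence = np.zeros((n, n))   # number of kernels containing each node pair
    for labels, matrix in kernels:
        pos = np.array([index[label] for label in labels])
        total[np.ix_(pos, pos)] += matrix
        presence[np.ix_(pos, pos)] += 1
    if mode == "mean":
        return all_labels, total / len(kernels)
    # Avoid division by zero for pairs that appear in no kernel
    divisor = np.where(presence > 0, presence, 1)
    return all_labels, total / divisor

# Hypothetical kernels sharing only the node "g2"
kernel_a = (["g1", "g2"], np.array([[1.0, 0.5], [0.5, 1.0]]))
kernel_b = (["g2", "g3"], np.array([[2.0, 0.4], [0.4, 2.0]]))

labels, by_mean = integrate([kernel_a, kernel_b], mode="mean")
_, by_presence = integrate([kernel_a, kernel_b], mode="mean_by_presence")
```

Under this reading both modes produce the same pattern of non-zero entries and differ only in the values, which is consistent with the identical non-zero counts reported for the two integrations in Table 5.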

Weight values